Abstract. The problem of testing a linear temporal logic (LTL) formula
on a finite execution trace of events, generated by an executing
program, occurs naturally in runtime analysis of software. We present
an algorithm which takes an LTL formula and generates an efficient
dynamic programming algorithm. The generated algorithm tests whether
the LTL formula is satisfied by a finite trace of events given as input. The
generated algorithm runs in linear time, its constant depending on the
size of the LTL formula. The memory needed is constant, also depending
on the size of the formula.


1 Introduction 

The work presented in this paper is part of an ambitious project at NASA Ames
Research Center, called PathExplorer, that aims at developing a practical 
testing environment for NASA software developers. The basic idea of the project 
is to extract an execution trace of a concurrent program and then analyze it to 
detect errors. The errors we are considering at this stage are deadlocks, data 
races, and non-conformance with linear temporal logic specifications. Only the
latter issue is addressed in this paper.

Linear Temporal Logic (LTL) [17] is a logic for specifying properties of re- 
active and concurrent systems. The models of LTL are infinite execution traces, 
reflecting the behavior of such systems as ideally always being ready to respond 
to requests, operating systems being a typical example. LTL has been mainly
used to specify properties of concurrent and interactive down-scaled models of 
real systems, so that fully formal correctness proofs could subsequently be car- 
ried out, for example using theorem provers or model checkers (see for example 
[12,9]). However, such formal proof techniques are usually not scalable to real 
sized systems without a substantial effort to abstract the system more or less 
manually to a model which can be analyzed. Model checking of programs has 
received an increased attention from the formal methods community within the 
last couple of years, and several systems have emerged that can directly model 
check [...] to avoid the abstraction process by not storing states. Although
these systems provide high confidence, they scale less well because most of their
internal algorithms are NP-complete or worse.

Testing scales well, and is by far the most used technique in practice to
validate software systems. The merge of testing and temporal logic specification
is an attempt to achieve the benefits of both approaches, while avoiding some of
the pitfalls of ad hoc testing and the complexity of full-blown theorem proving
and model checking. Of course there is a price to pay in order to obtain a
scalable technique: the loss of coverage. The suggested framework can only be
used to examine single execution traces, and can therefore not be used to prove 
a system correct. Our work is based on the belief that software engineers are 
willing to trade coverage for scalability so our goal is to provide tools that are 
completely automatic, implement very efficient algorithms and find many errors 
in programs. As mentioned previously, the work presented in this paper is part 
of a larger effort to develop a set of dynamic analysis algorithms and to integrate
these into a single tool named PathExplorer. Of particular additional interest 
are for example algorithms that can detect deadlock and data race potentials in 
a program, by examining a single arbitrary execution trace of the program, even 
though these errors do not occur in that trace. This can be achieved by analyzing
the way locks are acquired and released. A deadlock potential can for example 
be detected by observing that two threads take two locks in different order. 

A collection of commercial tools already provide this kind of analysis: Visual
Threads [7], which uses the Eraser algorithm [18] for detecting data races, and
which works on C and C++ programs using Pthreads; Assure [1], which works on
C++ programs using Pthreads; and finally Jprobe [19] for Java. In earlier work,
we implemented data race detection and deadlock detection algorithms for Java
in Java PathFinder [8]. It is our intention to extend this kind of technology by
identifying other error patterns that can be detected this way. A major goal is to
make PathExplorer adjustable to various programming languages and thus 
eventually deliver a Java PathExplorer as well as a C++ PathExplorer 
that share the same core algorithms but have different front ends. A longer term 
goal is to explore the use of conformance with a formal specification to achieve 
fault tolerance. The idea is that the failure may trigger a recovery action in the
monitored program. 

Following encouraging results using rewriting-based algorithms [11]
implemented in Maude [2], in this paper we investigate more efficient algorithms for
testing whether finite execution traces conform to LTL formulae. The idea of
using LTL in program testing is not new. It has already been pursued in
commercial tools such as TempRover (TR) [5], which has admittedly motivated us
in a major way to start this work. In TR, one states LTL properties as
annotations of the program, these being then replaced by appropriate code that is
executed whenever reached. Thus, TR can be seen as an extension of a
conventional programming language with LTL instructions. Inspired by the MaC [15]
tool, we decided to rather automatically instrument the bytecode or the object
code of a program to generate events of interest during the execution. Such an
event-based framework is well suited for program tracing in general, and it has
also been used to detect race conditions and deadlocks in the Visual Threads
[7] and Java PathFinder [8] tools. The trace of events can then be analyzed
using a different tool which can even run on a different platform. One can also
save various execution traces of a program and then have someone else analyze
them at a different time, in a different place. We were thus rapidly faced with
the following challenge:

Given a finite execution trace $t$ of events and an LTL formula $\varphi$, how
efficiently can one test whether $t$ satisfies $\varphi$?

A potential solution would be to translate the formula into an automaton and
then take the synchronized product of the automaton and the execution trace.
This is for example how Büchi automata are used in explicit-state model checkers
for representing formulae [13, 6]. A Büchi automaton is a special automaton
which accepts infinite traces (words): certain states are designated as acceptance
states, and an infinite trace is in the language of the automaton if and only if
it brings the automaton through an acceptance state infinitely often. A model
checker can detect such infinite traces by hashing states and detecting cycles that
include acceptance states. We have decided not to use Büchi automata for a
number of reasons.

- First, the translation of LTL formulae to Büchi automata is not trivial,
especially if one strives for small automata. It is worth mentioning that
other similar systems like Temporal Rover [5] and MaC [15] do not use
Büchi automata either.

- Second, at a semantic level, Büchi automata are interpreted over infinite
traces and it is not clear how to interpret them on finite traces. Consider for
example a property such as $\Box(p \rightarrow \Diamond q)$, the automaton $A$ generated from
the formula, and a finite trace $t$ that, according to the semantics, satisfies the
formula. The naive suggestion would be to drive the automaton $A$ by $t$ until
the end of the trace, and then observe whether the automaton is in an
acceptance state or not. This will, however, generally not work. In experiments
made using the LTL-to-Büchi automata translator in the SPIN system [13],
such a trace may bring the automaton to a state that is not an acceptance
state. Hence, one can generally not conclude anything from the resulting
state. A potential solution would be to pretend that an infinite sequence of
stuttering transitions is appended to the trace, where a stuttering transition
does not satisfy any proposition. One could then examine whether such a
stuttering sequence would bring the automaton from the state(s) resulting
from the finite trace, through an acceptance state infinitely often. Hence, the
stuttering should be shown to "finish off" the automaton correctly. However,
even though such an interpretation is possible, a different issue is that our
finite trace semantics of the always operator $\Box$ is different from the infinite
trace semantics implied by Büchi automata.



- Third, we think that the dynamic programming methodology that we suggest
yields more efficient testing tools than ones based on Büchi automata. In fact,
we claim that it is hard, if not impossible, to find more efficient algorithms
than those presented in this paper.

In spite of their efficiency and elegance, the generated algorithms have a
serious drawback: the execution trace needs to be visited backwards. This is
a typical phenomenon in dynamic programming algorithms which, taking into
account the continuously decreasing price of storage media, doesn't seem to be
a practical problem if one wants to first generate the events and then analyze
them. However, we admit that it can be a crucial issue when one wants to analyze
the events as they are generated, warning the programmer of errors or potential
errors while his/her program is being executed. We were not able to find a
dynamic programming algorithm that travels the trace forwards, but we are
confident that it can be done and post it as a challenge for the interested reader,
mentioning that it would have a great impact on testing methodologies and tools.
It is worth mentioning here that we did find and implement an algorithm that
visits the events in the order they were generated [11], but it is not as efficient
as the dynamic programming algorithms presented in this paper.

We'd like to warmly thank Rance Cleaveland, Dimitra Giannakopoulou and
Willem Visser for interesting and productive technical discussions directly
related to the effort in this paper, as well as Edmund Clarke, David Dill and
Doron Drusinsky for general discussions on dynamic analysis of programs and
its potential impact on computer aided verification.

2 Finite Trace Linear Temporal Logic 

We briefly remind the reader of the basic notions of finite trace linear temporal
logic, including a recursive definition of the satisfaction relation between a finite 
trace and an LTL formula. The interested reader can check [11] for more on this 
subject. 

We regard a trace as a finite sequence of events emitted by the program that 
we want to observe. Such events could indicate when variables are changed or 
when locks are acquired or released. Note that this view is slightly different from 
the traditional view where the trace is a sequence of program states, each state 
denoting the set of propositions that hold at that state. This view is consistent 
with our goal to define an LTL observer as a process that is detached from the 
program to be analyzed, receiving only observed events. To keep the presenta- 
tion simple and our results general, we abstract away from the concrete contents 
of events and just define events as atoms. Similarly, we consider the basic propo- 
sitions as simple as possible, also atoms, and say that a proposition a is satisfied 
by an event b if and only if a = b. In practice, one should necessarily consider
appropriate notions of satisfaction of propositions by events or states generated 
by events. We consider that this is an interesting but too concrete problem de- 
pending upon the events that one wants to observe, so we do not approach it 
here. 


Formulas. Assume that $Prop$ is a set of atoms, called atomic propositions.
Then $Formula$ is the free extension of $Prop$ under the standard propositional
constants and operators true, false, $\neg\_$, $\_\vee\_$, $\_\wedge\_$, $\_\rightarrow\_$, together with
the classical temporal logic operators $\circ\_$, $\Box\_$, $\Diamond\_$, and $\_\ \mathcal{U}\ \_$ whose meaning
will be given later.

Events and Traces. Suppose that $Event$ is a set of events. As we previously
mentioned, for the time being we consider that $Event = Prop$ is just a set
of atoms. The set of finite traces is $Event^*$, which we'll denote $Trace$, where
$\varepsilon$ denotes the empty trace. Assume two partial functions $head : Trace \rightarrow Event$
and $tail : Trace \rightarrow Trace$ for taking the head and the tail of a trace,
respectively, and a total function $length$ returning the length of a finite trace.
That is, $head(e\,t) = e$, $tail(e\,t) = t$, $length(\varepsilon) = 0$ and
$length(e\,t) = 1 + length(t)$. Assume further for any trace $t = e_1 e_2 \ldots e_n$ that $t_i$, for some
natural number $1 \le i \le n$, denotes the suffix trace $e_i e_{i+1} \ldots e_n$ that starts at
position $i$, and that $t_{n+1} = \varepsilon$; if $t = \varepsilon$ then $n = 0$ and $t_1 = \varepsilon$.

Satisfaction. The satisfaction relation $\models\ \subseteq Trace \times Formula$ defines when a
trace $t$ satisfies a formula $\varphi$, written $t \models \varphi$, and is defined inductively over
the structure of the formulae as follows, where $p \in Prop$ is any atomic
proposition and $\varphi$ and $\psi$ are any formulae:

$$t \models \Diamond\varphi \quad\text{iff}\quad (\exists\, 1 \le i \le length(t) + 1)\ t_i \models \varphi,$$

$$t \models \varphi\ \mathcal{U}\ \psi \quad\text{iff}\quad (\exists\, 1 \le i \le length(t) + 1)\ t_i \models \psi \ \text{and}\ (\forall\, 1 \le j < i)\ t_j \models \varphi.$$


The LTL operators have a slightly different interpretation in the context of finite
traces, though similar in spirit to their standard semantics in classical LTL with
infinite traces. The formula $\circ\varphi$ (next $\varphi$) holds for a finite trace iff the trace is
nonempty and $\varphi$ holds in the suffix trace starting in the next (the second) time
point. The formula $\Box\varphi$ (always $\varphi$) holds if $\varphi$ holds in all time points, while $\Diamond\varphi$
(eventually $\varphi$) holds if $\varphi$ holds in the present or in some future time point. The
formula $\varphi\ \mathcal{U}\ \psi$ ($\varphi$ until $\psi$) holds if $\psi$ holds in the present or in some future time
point, and until then $\varphi$ holds. As an example illustrating the semantics, the
formula $\Box(\varphi \rightarrow \Diamond\psi)$ holds for a finite trace iff for any time point in the trace it
holds that if $\varphi$ is true then eventually $\psi$ is true.

The reader probably noticed that $i$ ranges from 1 to $length(t)$ in the definition
of the semantics of $\Box$, while it ranges from 1 to $length(t) + 1$ in the case of $\Diamond$.
This discrepancy is not a typo; it is because of the intended semantics of the two
operators on the empty trace, that is, $\varepsilon \models \Box\varphi$ for any formula $\varphi$ while $\varepsilon \models \Diamond\varphi$
if and only if $\varepsilon \models \varphi$. We still don't know exactly if this is the most appropriate
semantics of the two operators; it should be taken just as a subjective choice at
this incipient stage, but we are certainly going to clarify this issue as soon as we
get more practical experience with this new technology. However, the algorithms
presented in this paper do not essentially depend on this choice.

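An important observation which is crucial to the design of the dynamic
programming generic algorithms presented later is that the relation $\models$ can be
defined recursively in the context of finite traces. We only need to consider the
temporal operators:

$$\varepsilon \models \circ\varphi \ \text{is false}, \qquad e\,t \models \circ\varphi \ \text{iff}\ t \models \varphi,$$

$$\varepsilon \models \Box\varphi \ \text{is true}, \qquad e\,t \models \Box\varphi \ \text{iff}\ e\,t \models \varphi \ \text{and}\ t \models \Box\varphi,$$

$$\varepsilon \models \Diamond\varphi \ \text{iff}\ \varepsilon \models \varphi, \qquad e\,t \models \Diamond\varphi \ \text{iff}\ e\,t \models \varphi \ \text{or}\ t \models \Diamond\varphi,$$

$$\varepsilon \models \varphi\ \mathcal{U}\ \psi \ \text{iff}\ \varepsilon \models \psi, \qquad e\,t \models \varphi\ \mathcal{U}\ \psi \ \text{iff}\ e\,t \models \psi \ \text{or}\ (e\,t \models \varphi \ \text{and}\ t \models \varphi\ \mathcal{U}\ \psi).$$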
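The recursive characterization can be turned directly into a naive checker. The sketch below is an illustration under stated assumptions, not the efficient algorithm of this paper: formulae are encoded as nested Python tuples (an encoding chosen here), traces are lists of atom names, and an atomic proposition is taken to be false on the empty trace, in line with the initialization rule for variables in Section 4.

```python
def sat(trace, f):
    """Return True iff the finite trace (a list of atoms) satisfies formula f.

    Formulae are tuples: ("atom", name), ("not", g), ("and", g, h),
    ("or", g, h), ("implies", g, h), ("next", g), ("always", g),
    ("eventually", g), ("until", g, h).
    """
    op = f[0]
    if op == "atom":
        # Assumption: an atom holds iff it equals the first event.
        return len(trace) > 0 and trace[0] == f[1]
    if op == "not":
        return not sat(trace, f[1])
    if op == "and":
        return sat(trace, f[1]) and sat(trace, f[2])
    if op == "or":
        return sat(trace, f[1]) or sat(trace, f[2])
    if op == "implies":
        return (not sat(trace, f[1])) or sat(trace, f[2])
    if op == "next":
        # False on the empty trace, otherwise check the tail.
        return len(trace) > 0 and sat(trace[1:], f[1])
    if op == "always":
        # True on the empty trace.
        return len(trace) == 0 or (sat(trace, f[1]) and sat(trace[1:], f))
    if op == "eventually":
        if not trace:
            return sat(trace, f[1])
        return sat(trace, f[1]) or sat(trace[1:], f)
    if op == "until":
        if not trace:
            return sat(trace, f[2])
        return sat(trace, f[2]) or (sat(trace, f[1]) and sat(trace[1:], f))
    raise ValueError(f"unknown operator: {op}")
```

This version re-evaluates subformulae many times and copies suffixes, so it is useful mainly as an executable reading of the definitions, for instance as a reference against which a faster implementation can be compared.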

3 An Example 

In this section we show how to generate dynamic programming code for a con- 
crete LTL formula. We think that this example would practically be sufficient 
for the reader to foresee our general algorithm presented in the next section. 

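Let $\Box((p\ \mathcal{U}\ q) \rightarrow \Diamond(q \rightarrow \circ r))$ be an LTL formula and let $\varphi_1, \varphi_2, \ldots, \varphi_{10}$ be
its subformulae, in breadth-first order:

$$\begin{array}{rcl}
\varphi_1 &=& \Box((p\ \mathcal{U}\ q) \rightarrow \Diamond(q \rightarrow \circ r)), \\
\varphi_2 &=& (p\ \mathcal{U}\ q) \rightarrow \Diamond(q \rightarrow \circ r), \\
\varphi_3 &=& p\ \mathcal{U}\ q, \\
\varphi_4 &=& \Diamond(q \rightarrow \circ r), \\
\varphi_5 &=& p, \\
\varphi_6 &=& q, \\
\varphi_7 &=& q \rightarrow \circ r, \\
\varphi_8 &=& q, \\
\varphi_9 &=& \circ r, \\
\varphi_{10} &=& r.
\end{array}$$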
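The breadth-first numbering of subformula occurrences can be sketched as follows. The tuple encoding of formulae and the name `bfs_number` are assumptions made for this illustration; positions are 0-based here, so subformula number $j$ of the text sits at index $j - 1$.

```python
def bfs_number(formula):
    """Number the subformula occurrences of `formula` in breadth-first order.

    Formulae are nested tuples such as ("atom", "p"), ("next", g),
    ("until", g, h). Returns (order, children) where order[k] is the
    subformula at 0-based position k and children[k] lists the positions
    of its operands. Every operand gets a larger position than its parent.
    """
    order = [formula]
    children = []
    k = 0
    while k < len(order):          # `order` doubles as the BFS queue
        f = order[k]
        kids = [] if f[0] == "atom" else list(f[1:])
        first = len(order)
        children.append(list(range(first, first + len(kids))))
        order.extend(kids)
        k += 1
    return order, children


if __name__ == "__main__":
    p, q, r = ("atom", "p"), ("atom", "q"), ("atom", "r")
    example = ("always",
               ("implies", ("until", p, q),
                ("eventually", ("implies", q, ("next", r)))))
    order, children = bfs_number(example)
    for k, f in enumerate(order):
        print(k + 1, f[0], [c + 1 for c in children[k]])
```

Note that the two occurrences of q are numbered separately, as in the list above, because the traversal works on the syntax tree rather than on the set of distinct subformulae.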

Given any finite trace $t = e_1 e_2 \ldots e_n$ of $n$ events, one can recursively define a
matrix s[1..n+1, 1..10] of boolean values {0, 1}, with the meaning that s[i, j] = 1
iff $t_i \models \varphi_j$, as follows:

    s[i, 10] = (e_i == r)
    s[i, 9]  = s[i + 1, 10]
    s[i, 8]  = (e_i == q)
    s[i, 7]  = s[i, 8] implies s[i, 9]
    s[i, 6]  = (e_i == q)
    s[i, 5]  = (e_i == p)
    s[i, 4]  = s[i, 7] or s[i + 1, 4]
    s[i, 3]  = s[i, 6] or (s[i, 5] and s[i + 1, 3])
    s[i, 2]  = s[i, 3] implies s[i, 4]
    s[i, 1]  = s[i, 2] and s[i + 1, 1],

for all $i \le n$, where and, or, implies are ordinary boolean operations and == is
the equality predicate, and where s[n+1, 1..10] are defined as below:


    s[n + 1, 10] = 0
    s[n + 1, 9]  = 0
    s[n + 1, 8]  = 0
    s[n + 1, 7]  = s[n + 1, 8] implies s[n + 1, 9]
    s[n + 1, 6]  = 0
    s[n + 1, 5]  = 0
    s[n + 1, 4]  = s[n + 1, 7]
    s[n + 1, 3]  = s[n + 1, 6]
    s[n + 1, 2]  = s[n + 1, 3] implies s[n + 1, 4]
    s[n + 1, 1]  = 1.

An important observation is that, like in many other dynamic programming
algorithms, one doesn't have to store the whole table s[1..n+1, 1..10], which would
be quite large in practice; in this case, one needs only two rows, s[i, 1..10] and
s[i + 1, 1..10], which we'll write now and next from now on, respectively. It is
now only a simple exercise to write up the following algorithm:

    Input: trace t = e_1 e_2 ... e_n
    next[10] ← 0;
    next[9] ← 0;
    next[8] ← 0;
    next[7] ← next[8] implies next[9];
    next[6] ← 0;
    next[5] ← 0;
    next[4] ← next[7];
    next[3] ← next[6];
    next[2] ← next[3] implies next[4];
    next[1] ← 1;
    for i = n downto 1 do {
        now[10] ← (e_i == r);
        now[9] ← next[10];
        now[8] ← (e_i == q);
        now[7] ← now[8] implies now[9];
        now[6] ← (e_i == q);
        now[5] ← (e_i == p);
        now[4] ← now[7] or next[4];
        now[3] ← now[6] or (now[5] and next[3]);
        now[2] ← now[3] implies now[4];
        now[1] ← now[2] and next[1];
        next ← now }
    output(next[1]);
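The algorithm above translates almost line by line into Python. In this sketch the function names `implies` and `example_monitor`, the representation of events as the strings "p", "q" and "r", and the choice to return the whole final vector rather than only its first entry are assumptions made for illustration.

```python
def implies(a, b):
    """Boolean implication on 0/1 values."""
    return int((not a) or b)


def example_monitor(trace):
    """Evaluate [](( p U q ) -> <>( q -> o r )) on a finite trace of atoms.

    Returns the final vector; entry j (1-based, index 0 unused) is the truth
    value of subformula j on the whole trace, so entry 1 is the answer.
    """
    # Values on the empty trace (row n + 1 of the table).
    nxt = [0] * 11
    nxt[10] = 0
    nxt[9] = 0
    nxt[8] = 0
    nxt[7] = implies(nxt[8], nxt[9])
    nxt[6] = 0
    nxt[5] = 0
    nxt[4] = nxt[7]
    nxt[3] = nxt[6]
    nxt[2] = implies(nxt[3], nxt[4])
    nxt[1] = 1
    # Visit the events backwards, keeping only two rows of the table.
    for e in reversed(trace):
        now = [0] * 11
        now[10] = int(e == "r")
        now[9] = nxt[10]
        now[8] = int(e == "q")
        now[7] = implies(now[8], now[9])
        now[6] = int(e == "q")
        now[5] = int(e == "p")
        now[4] = int(now[7] or nxt[4])
        now[3] = int(now[6] or (now[5] and nxt[3]))
        now[2] = implies(now[3], now[4])
        now[1] = int(now[2] and nxt[1])
        nxt = now
    return nxt


if __name__ == "__main__":
    print(example_monitor(["p", "q", "r"])[1])
```

Allocating a fresh `now` list per event keeps the sketch short; an implementation concerned with speed would reuse two fixed arrays and swap them.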

Given a fixed LTL formula, the analysis of this algorithm is straightforward.
Its time complexity is $\Theta(n)$ where $n$ is the length of the input trace, the constant
being given by the size of the LTL formula. The memory required is constant,
since the length of the two arrays is the size of the LTL formula. However, one
may want to also include the size of the formula, say $m$, into the analysis; then
the time complexity is obviously $\Theta(n \cdot m)$ while the memory required is $2m$
bits. The authors think that it is hard to find an algorithm running faster than
the above in practical situations.

4 The Main Algorithm 

We now formally describe our algorithm that synthesizes a dynamic
programming algorithm from an LTL formula. Our synthesizer is generic, the potential
user being expected to adapt it to his/her desired target language. The algorithm
consists of three main steps:

Breadth First Search. The LTL formula should first be visited in breadth-
first order to assign increasing numbers to subformulae as they are visited.
Let $\varphi_1, \varphi_2, \ldots, \varphi_m$ be the list of all subformulae in BFS order. Because of the
semantics of finite trace LTL, this step ensures that the truth value of
$t_i \models \varphi_j$ can be completely determined from the truth values of $t_i \models \varphi_{j'}$
for all $j < j' \le m$ and the truth values of $t_{i+1} \models \varphi_{j'}$ for all $j \le j' \le m$.
This recurrence gives the order in which one should generate the code.

Loop Initialization. Before we generate the "for" loop, we should first
initialize the vector next[1..m], which basically gives the truth values of the
subformulae on the empty trace. According to the semantics of LTL, one
should fill the vector next backwards. For a given $m \ge j \ge 1$, next[j] is
calculated as follows:

- If $\varphi_j$ is a variable then next[j] = 0. Notice that $\varphi_m$ is always a variable.
In a more complex setting of LTL, containing more complex propositions
than just propositional variables, one would have to evaluate $\varphi_j$ in the
context of the empty trace or of the final state generated by the trace of
events;

- If $\varphi_j$ is $\neg\varphi_{j'}$ for some $j < j' \le m$, then next[j] = not next[j'], where not
is the negation operation on booleans (bits);

- If $\varphi_j$ is $\varphi_{j_1}\ Op\ \varphi_{j_2}$ for some $j < j_1, j_2 \le m$, then next[j] = next[j_1] op next[j_2],
where $Op$ is any propositional operation and op is its corresponding
boolean operation;

- If $\varphi_j$ is $\circ\varphi_{j'}$ then clearly next[j] = 0 according to the semantics of finite
trace LTL;

- If $\varphi_j$ is $\Box\varphi_{j'}$ then next[j] = 1 because the empty trace satisfies "always"
everything;

- If $\varphi_j$ is $\Diamond\varphi_{j'}$ then next[j] = next[j'] because there are no further events
that could make $\varphi_{j'}$ hold in the future: it must hold now;

- If $\varphi_j$ is $\varphi_{j_1}\ \mathcal{U}\ \varphi_{j_2}$ for some $j < j_1, j_2 \le m$, then next[j] = next[j_2] for the
same reason as above.

Loop Generation. Because of the dependences in the recursive definition of the
finite trace LTL satisfaction relation, one is expected to visit the trace
backwards, so the loop index will vary from n downto 1. The loop body will
update/calculate the vector now and in the end will move it into the vector
next to serve as basis for the next iteration. At a certain iteration i, the
vector now is updated also backwards as follows:


- If $\varphi_j$ is a variable then now[j] depends only on the event $e_i$. In our
simple version of LTL, now[j] is 1 if and only if $e_i = \varphi_j$. In a more
complex finite trace LTL where $\varphi_j$ was a proposition, one would be
expected to evaluate $\varphi_j$ in a state at moment $i$;

- If $\varphi_j$ is $\neg\varphi_{j'}$ for some $j < j' \le m$, then now[j] = not now[j'];

- If $\varphi_j$ is $\varphi_{j_1}\ Op\ \varphi_{j_2}$ for $j < j_1, j_2 \le m$, then now[j] = now[j_1] op now[j_2],
where $Op$ is any propositional operation and op is its corresponding
boolean operation;

- If $\varphi_j$ is $\circ\varphi_{j'}$ then now[j] = next[j'] since $\varphi_j$ holds now if and only if $\varphi_{j'}$
held at the previous step (which treated the next event, the (i+1)-th);

- If $\varphi_j$ is $\Box\varphi_{j'}$ then now[j] = now[j'] and next[j] because $\varphi_j$ holds now if
and only if $\varphi_{j'}$ holds now and $\varphi_j$ held at the previous iteration;

- If $\varphi_j$ is $\Diamond\varphi_{j'}$ then now[j] = now[j'] or next[j] because of similar reasons
as above;

- If $\varphi_j$ is $\varphi_{j_1}\ \mathcal{U}\ \varphi_{j_2}$ for some $j < j_1, j_2 \le m$, then because of the recursion
at the end of Section 2, now[j] = now[j_2] or (now[j_1] and next[j]).

After each iteration $i$, next[1] tells whether the initial LTL formula is validated
by the trace $e_i e_{i+1} \ldots e_n$. Therefore, the desired output is next[1] after the last
iteration. Putting all the above together, one can now write up the generic
pseudocode presented in the appendix, which can be implemented very efficiently on
any current platform. Since the BFS procedure is linear, the algorithm
synthesizes a dynamic programming algorithm from an LTL formula in linear time
with the size of the formula.
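The three steps can also be combined into an interpreter that evaluates the rules directly instead of emitting code for them. This is a variant written for illustration, not the synthesizer itself; the tuple encoding of formulae, the 0-based positions, and the names `bfs_number`, `evaluate` and `monitor` are assumptions.

```python
def bfs_number(formula):
    """Breadth-first numbering: order[k] is a subformula, children[k] its operands."""
    order, children, k = [formula], [], 0
    while k < len(order):
        f = order[k]
        kids = [] if f[0] == "atom" else list(f[1:])
        children.append(list(range(len(order), len(order) + len(kids))))
        order.extend(kids)
        k += 1
    return order, children


def evaluate(order, children, event, nxt):
    """Compute the vector `now` from `nxt`; event and nxt are None on the empty trace."""
    m = len(order)
    now = [False] * m
    for j in range(m - 1, -1, -1):          # operands before their parents
        op, c = order[j][0], children[j]
        if op == "atom":
            now[j] = event is not None and event == order[j][1]
        elif op == "not":
            now[j] = not now[c[0]]
        elif op == "and":
            now[j] = now[c[0]] and now[c[1]]
        elif op == "or":
            now[j] = now[c[0]] or now[c[1]]
        elif op == "implies":
            now[j] = (not now[c[0]]) or now[c[1]]
        elif op == "next":
            now[j] = nxt[c[0]] if nxt is not None else False
        elif op == "always":
            now[j] = (now[c[0]] and nxt[j]) if nxt is not None else True
        elif op == "eventually":
            now[j] = now[c[0]] or (nxt is not None and nxt[j])
        elif op == "until":
            now[j] = now[c[1]] or (nxt is not None and now[c[0]] and nxt[j])
        else:
            raise ValueError(f"unknown operator: {op}")
    return now


def monitor(formula, trace):
    """Return True iff the finite trace satisfies the formula."""
    order, children = bfs_number(formula)
    nxt = evaluate(order, children, None, None)   # loop initialization
    for e in reversed(trace):                     # backward traversal
        nxt = evaluate(order, children, e, nxt)
    return nxt[0]
```

Interpreting the formula at every event costs a dispatch per subformula; generating straight-line code once, as the synthesizer does, avoids that overhead.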

A Generic Pseudocode for the Synthesizer

The following generic program implements the technique discussed in the paper.
It takes as input an LTL formula and generates a for loop which traverses the
trace of events backwards, thus validating or invalidating the formula.

    Input: LTL formula φ

    output("Input: trace t = e_1 e_2 ... e_n;");
    let φ_1, φ_2, ..., φ_m be all the subformulae of φ in BFS order
    for j = m downto 1 do {
        output("next[", j, "] ← ");
        if φ_j is a variable then output("0;");
        if φ_j = ¬φ_j' then output("not next[", j', "];");
        if φ_j = φ_j1 Op φ_j2 then output("next[", j1, "] op next[", j2, "];");
        if φ_j = ○φ_j' then output("0;");
        if φ_j = □φ_j' then output("1;");
        if φ_j = ◇φ_j' then output("next[", j', "];");
        if φ_j = φ_j1 U φ_j2 then output("next[", j2, "];"); }
    output("for i = n downto 1 do {");
    for j = m downto 1 do {
        output("  now[", j, "] ← ");
        if φ_j is a variable then output("(e_i == ", φ_j, ");");
        if φ_j = ¬φ_j' then output("not now[", j', "];");
        if φ_j = φ_j1 Op φ_j2 then output("now[", j1, "] op now[", j2, "];");
        if φ_j = ○φ_j' then output("next[", j', "];");
        if φ_j = □φ_j' then output("now[", j', "] and next[", j, "];");
        if φ_j = ◇φ_j' then output("now[", j', "] or next[", j, "];");
        if φ_j = φ_j1 U φ_j2 then output("now[", j2, "] or (now[", j1, "] and
                                          next[", j, "]);"); }
    output("  next ← now }");
    output("output next[1];");

where Op is any propositional connective and op is its corresponding boolean 
operator. 

The boolean operations used above are usually very efficiently implemented
on any microprocessor and the vectors of bits next and now are small enough to
be kept in cache. Moreover, the dependencies between instructions in the
generated for loop are simple to analyze, so a reasonable compiler can easily unfold
and/or parallelize it to take advantage of the machine's resources. Consequently, the
generated code is expected to run very fast.
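As one possible adaptation of the generic pseudocode to a concrete target language, the sketch below emits Python source text for a given formula. The tuple encoding of formulae, the template tables `EMPTY` and `STEP`, the 0-based positions, and the names `bfs_number` and `synthesize` are assumptions of this illustration rather than part of the paper.

```python
# Right-hand sides on the empty trace (loop initialization).
EMPTY = {
    "atom": "False", "not": "not nxt[{a}]",
    "and": "nxt[{a}] and nxt[{b}]", "or": "nxt[{a}] or nxt[{b}]",
    "implies": "(not nxt[{a}]) or nxt[{b}]",
    "next": "False", "always": "True",
    "eventually": "nxt[{a}]", "until": "nxt[{b}]",
}
# Right-hand sides inside the backward loop (loop generation).
STEP = {
    "atom": "e == {atom!r}", "not": "not now[{a}]",
    "and": "now[{a}] and now[{b}]", "or": "now[{a}] or now[{b}]",
    "implies": "(not now[{a}]) or now[{b}]",
    "next": "nxt[{a}]", "always": "now[{a}] and nxt[{j}]",
    "eventually": "now[{a}] or nxt[{j}]",
    "until": "now[{b}] or (now[{a}] and nxt[{j}])",
}


def bfs_number(formula):
    """Breadth-first numbering: order[k] is a subformula, children[k] its operands."""
    order, children, k = [formula], [], 0
    while k < len(order):
        f = order[k]
        kids = [] if f[0] == "atom" else list(f[1:])
        children.append(list(range(len(order), len(order) + len(kids))))
        order.extend(kids)
        k += 1
    return order, children


def synthesize(formula, name="monitor"):
    """Return Python source for a function `name(trace)` that tests the formula."""
    order, children = bfs_number(formula)
    m = len(order)

    def assignments(templates, target, indent):
        out = []
        for j in range(m - 1, -1, -1):      # operands before their parents
            op, kids = order[j][0], children[j]
            rhs = templates[op].format(
                a=kids[0] if kids else None,
                b=kids[1] if len(kids) > 1 else None,
                j=j,
                atom=order[j][1] if op == "atom" else None,
            )
            out.append(f"{indent}{target}[{j}] = {rhs}")
        return out

    lines = [f"def {name}(trace):", f"    nxt = [False] * {m}"]
    lines += assignments(EMPTY, "nxt", "    ")
    lines += ["    for e in reversed(trace):", f"        now = [False] * {m}"]
    lines += assignments(STEP, "now", "        ")
    lines += ["        nxt = now", "    return nxt[0]"]
    return "\n".join(lines)


if __name__ == "__main__":
    p, q = ("atom", "p"), ("atom", "q")
    source = synthesize(("always", ("implies", p, ("eventually", q))))
    print(source)
    namespace = {}
    exec(source, namespace)                 # load the generated function
    print(namespace["monitor"](["p", "q"]))
```

The generated function contains only straight-line boolean assignments inside a single loop, which is the property the closing remarks above rely on.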


